Adaptating the Levenshtein Distance to Contextual Spelling Correction

Authors

Aouragh Si-Lhoussain

Hicham Gueddah

Abdellah Yousfi

Abstract

In recent years, computing environments for human learning have evolved rapidly with the development of information and communication technologies. In parallel, the automatic correction of spelling errors has become increasingly essential. In this context, we have developed a system for correcting spelling errors in Arabic, based on language models and the Levenshtein algorithm. The distance returned by the Levenshtein algorithm is often the same for several candidate corrections of a misspelled word. To overcome this limitation, we added a weighting based on language models. This combination allowed us to filter and refine the candidates first produced by the Levenshtein algorithm when applied to errors in Arabic words. The results are encouraging and demonstrate the value of this approach.
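
The tie-breaking idea in the abstract can be illustrated with a small sketch. The abstract does not specify the weighting, so the add-one-smoothed bigram probability used here is an assumption, as are the function names `levenshtein` and `rank_candidates` and the toy English lexicon that stands in for Arabic words. Candidates are ordered by edit distance first, and candidates at the same distance are ordered by how likely they are after the preceding word.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance (insertion, deletion, substitution)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def rank_candidates(error, previous_word, lexicon, bigrams, unigrams):
    """Sort lexicon words by distance, breaking ties with a bigram model.

    Assumption: P(w | previous_word) is estimated with add-one smoothing;
    the paper's actual weighting may differ.
    """
    vocab = len(lexicon)
    scored = []
    for word in lexicon:
        dist = levenshtein(error, word)
        prob = (bigrams[(previous_word, word)] + 1) / (unigrams[previous_word] + vocab)
        scored.append((dist, -prob, word))  # smaller distance, then larger probability
    scored.sort()
    return [word for _, _, word in scored]


# Hypothetical toy data: all three candidates are at distance 1 from "cxt".
lexicon = ["cat", "cot", "cut"]
unigrams = Counter({"the": 2, "a": 1})
bigrams = Counter({("the", "cat"): 2, ("a", "cut"): 1})

best_after_the = rank_candidates("cxt", "the", lexicon, bigrams, unigrams)[0]
best_after_a = rank_candidates("cxt", "a", lexicon, bigrams, unigrams)[0]
```

With this toy data the edit distance alone cannot separate the three candidates, while the context word does: `best_after_the` is `"cat"` and `best_after_a` is `"cut"`.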

Similar resources

Introduction of the weight edition errors in the Levenshtein distance

In this paper, we present a new approach to correcting spelling errors in Arabic. The approach corrects typographical errors such as insertion, deletion, and permutation. Our method is inspired by the Levenshtein algorithm and yields a finer and better ranking of candidates than Levenshtein. The results obtained are very satisfactory and encouraging, which shows the interest of...

Practical Methods for Approximate String Matching

Given a pattern string and a text, the task of approximate string matching is to find all locations in the text that are similar to the pattern. This type of search may be done for example in applications of spelling error correction or bioinformatics. Typically edit distance is used as the measure of similarity (or distance) between two strings. In this thesis we concentrate on unit-cost edit ...

Fuzzy lexical matching

Being able to automatically correct spelling errors is useful in cases where the set of documents is too vast to involve human interaction. In this bachelor's thesis, we investigate an implementation that attempts to perform such corrections using a lexicon and edit distance measure. We compare the familiar Levenshtein and Damerau-Levenshtein distances to modifications where each edit operation ...

Error Correction for Arabic Dictionary Lookup

We describe a new Arabic spelling correction system which is intended for use with electronic dictionary search by learners of Arabic. Unlike other spelling correction systems, this system does not depend on a corpus of attested student errors but on student- and teacher-generated ratings of confusable pairs of phonemes or letters. Separate error modules for keyboard mistypings, phonetic confusio...

Journal title:

IJCSA

Volume 12, issue

Pages -

Publication date: 2015
